Fast Winner Search for SOM-Based Monitoring and Retrieval of High-Dimensional Data
نویسنده
چکیده
Self-Organizing Maps (SOMs) are widely used in engineering and data-analysis tasks, but so far rarely in very large-scale problems. The reason is the amount of computation: while small SOMs can be computed starting from the basic principles, rapid computation of large maps of high-dimensional data requires special methods. Winner search, nding the position of a data sample on the map, is the computational bottleneck: comparison between the data vector and all of the model vectors of the map is required. In this paper a method is proposed for reducing the amount of computation by restricting the search to certain small-dimensional subspaces of the original space. The method is suitable for applications in which the map can be computed oo-line, for instance in data monitoring, classi-cation, and information retrieval. In a case study with the WEBSOM system that organizes text document collections on a SOM, the amount of computation was reduced to about 14% of the original, and even to 6.6% when approximations were utilized.
منابع مشابه
یک روش مبتنی بر خوشهبندی سلسلهمراتبی تقسیمکننده جهت شاخصگذاری اطلاعات تصویری
It is conventional to use multi-dimensional indexing structures to accelerate search operations in content-based image retrieval systems. Many efforts have been done in order to develop multi-dimensional indexing structures so far. In most practical applications of image retrieval, high-dimensional feature vectors are required, but current multi-dimensional indexing structures lose their effici...
متن کاملSimilarity Retrieval Based on SOM-Based R*-Tree
Feature-based similarity retrieval has become an important research issue in multimedia database systems. The features of multimedia data are useful for discriminating between multimedia objects (e g documents, images, video, music score, etc.). For example, images are represented by their color histograms, texture vectors, and shape descriptors, and are usually high-dimensional data. The perfo...
متن کاملA Margin-based Model with a Fast Local Searchnewline for Rule Weighting and Reduction in Fuzzynewline Rule-based Classification Systems
Fuzzy Rule-Based Classification Systems (FRBCS) are highly investigated by researchers due to their noise-stability and interpretability. Unfortunately, generating a rule-base which is sufficiently both accurate and interpretable, is a hard process. Rule weighting is one of the approaches to improve the accuracy of a pre-generated rule-base without modifying the original rules. Most of the pro...
متن کاملSOM-Based R*-tree for Similarity Retrieval
Feature-based similarity retrieval has become an iniportant research issue in multimedia database systems. The features of multimedia data are useful for discriminating between multimedia objects (e.g., documents, images, video, music score, etc.). For example, images are represented by their color histograms, texture vectors, and shape descriptors. A feature vector is a vector that represents ...
متن کاملA Monte Carlo-Based Search Strategy for Dimensionality Reduction in Performance Tuning Parameters
Redundant and irrelevant features in high dimensional data increase the complexity in underlying mathematical models. It is necessary to conduct pre-processing steps that search for the most relevant features in order to reduce the dimensionality of the data. This study made use of a meta-heuristic search approach which uses lightweight random simulations to balance between the exploitation of ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
عنوان ژورنال:
دوره شماره
صفحات -
تاریخ انتشار 1999